{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "C9LspitiAK83"
   },
   "source": [
    "## Markov Processes\n",
    "\n",
    "### Markov Chain - Continuing\n",
    "\n",
    "A continuing Markov chain has no terminal state, so the process never ends.\n",
    "\n",
    "![Markov Chain](https://raw.githubusercontent.com/nsanghi/drl-2ed/main/chapter2/images/mcchain_continuing.png \"Markov Chain\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "id": "mJGPHIKfAK86"
   },
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Transition Matrix:\n",
      " [[0.3 0.7]\n",
      " [0.2 0.8]]\n",
      "\n",
      "Iter 0. State Probability vector S = [0.25 0.75]\n",
      "\n",
      "Iter 5. State Probability vector S = [0.2222225 0.7777775]\n",
      "\n",
      "Final Vector S=[0.22222222 0.77777778]\n"
     ]
    }
   ],
   "source": [
    "# MC with no end\n",
    "\n",
    "# import numpy library to do vector algebra\n",
    "import numpy as np\n",
    "\n",
    "# define transition matrix\n",
    "P = np.array([[0.3, 0.7], [0.2, 0.8]])\n",
    "print(\"Transition Matrix:\\n\", P)\n",
    "\n",
    "# define a starting guess for the state probabilities\n",
    "# Here we assume equal probabilities for all the states\n",
    "S = np.array([0.5, 0.5])\n",
    "\n",
    "# run 10 iterations of S <- S.P to approximate the steady-state\n",
    "# state probabilities\n",
    "for i in range(10):\n",
    "    S = np.dot(S, P)\n",
    "    if i % 5 == 0:\n",
    "        print(\"\\nIter {0}. State Probability vector S = {1}\".format(i, S))\n",
    "\n",
    "\n",
    "print(\"\\nFinal Vector S={0}\".format(S))"
   ]
  },
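  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The iteration above only approximates the steady state. As a cross-check, here is a minimal sketch that computes the stationary distribution directly, assuming it is the left eigenvector of `P` with eigenvalue 1 (so that `pi = pi P`). The names `pi` and `idx` are illustrative and are not part of the original example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# same transition matrix as the continuing chain above\n",
    "P = np.array([[0.3, 0.7], [0.2, 0.8]])\n",
    "\n",
    "# A stationary distribution pi satisfies pi = pi P, so pi is a left\n",
    "# eigenvector of P, i.e. a right eigenvector of P.T, with eigenvalue 1.\n",
    "eigvals, eigvecs = np.linalg.eig(P.T)\n",
    "\n",
    "# pick the eigenvector whose eigenvalue is closest to 1\n",
    "idx = np.argmin(np.abs(eigvals - 1.0))\n",
    "pi = np.real(eigvecs[:, idx])\n",
    "\n",
    "# rescale so that the entries sum to 1 (this also fixes the sign)\n",
    "pi = pi / pi.sum()\n",
    "\n",
    "print(\"Stationary distribution:\", pi)\n",
    "print(\"Satisfies pi = pi P:\", np.allclose(pi @ P, pi))"
   ]
  },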
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Kz4UtBbdAK89"
   },
   "source": [
    "### Markov Chain - Episodic\n",
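    "\n",
    "These are the chains that end in a terminal state. In the transition matrix below, the last state transitions only to itself with probability 1.\n",
    "\n",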
    "![Markov Chain Episodic](https://raw.githubusercontent.com/nsanghi/drl-2ed/main/chapter2/images/mc_episodic.png \"Markov Chain Episodic\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "jKWlO0R5AK8-",
    "outputId": "fd0bc6be-a2c4-433f-ad0b-949e211edacc"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Transition Matrix:\n",
      " [[0.3 0.5 0.2 0. ]\n",
      " [0.1 0.9 0.  0. ]\n",
      " [0.4 0.  0.  0.6]\n",
      " [0.  0.  0.  1. ]]\n",
      "\n",
      "Final Vector S=[4.73405766e-09 2.84857504e-08 9.63092427e-10 9.99999966e-01]\n"
     ]
    }
   ],
   "source": [
    "# MC Episodic\n",
    "\n",
    "# import numpy library to do vector algebra\n",
    "import numpy as np\n",
    "\n",
    "# define transition matrix\n",
    "P = np.array([\n",
    "    [0.3, 0.5, 0.2, 0.0],\n",
    "    [0.1, 0.9, 0.0, 0.0],\n",
    "    [0.4, 0, 0, 0.6],\n",
    "    [0, 0, 0, 1]\n",
    "])\n",
    "\n",
    "print(\"Transition Matrix:\\n\", P)\n",
    "\n",
    "# define the starting state probabilities\n",
    "# Here we assume the chain starts in the first state with probability 1\n",
    "S = np.array([1, 0, 0, 0])\n",
    "\n",
    "# run through 1000 iterations to calculate the steady-state\n",
    "# state probabilities, which should give prob ~ 1 for the terminal state\n",
    "for i in range(1000):\n",
    "    S = np.dot(S, P)\n",
    "    # print(\"\\nIter {0}. Probability vector S = {1}\".format(i, S))\n",
    "\n",
    "\n",
    "print(\"\\nFinal Vector S={0}\".format(S))"
   ]
  },
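  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The matrix iteration above tracks state probabilities. A complementary view is to sample a single episode from the same chain. The following is a minimal sketch, assuming state index 3 is the terminal state (its row in `P` transitions only to itself); the helper `sample_episode`, the `max_steps` cap, and the seed are illustrative choices, not part of the original example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# same transition matrix as the episodic chain above\n",
    "P = np.array([\n",
    "    [0.3, 0.5, 0.2, 0.0],\n",
    "    [0.1, 0.9, 0.0, 0.0],\n",
    "    [0.4, 0.0, 0.0, 0.6],\n",
    "    [0.0, 0.0, 0.0, 1.0]\n",
    "])\n",
    "\n",
    "# assumption: the last state is terminal, since it only transitions to itself\n",
    "TERMINAL = 3\n",
    "\n",
    "\n",
    "def sample_episode(P, start, terminal, rng, max_steps=10_000):\n",
    "    \"\"\"Sample states from the chain until the terminal state is reached.\n",
    "\n",
    "    Hypothetical helper; max_steps is only a safety cap on episode length.\n",
    "    \"\"\"\n",
    "    states = [start]\n",
    "    state = start\n",
    "    while state != terminal and len(states) <= max_steps:\n",
    "        # draw the next state using the current state's row of P\n",
    "        state = int(rng.choice(len(P), p=P[state]))\n",
    "        states.append(state)\n",
    "    return states\n",
    "\n",
    "\n",
    "rng = np.random.default_rng(0)  # fixed seed so the run is repeatable\n",
    "episode = sample_episode(P, 0, TERMINAL, rng)\n",
    "\n",
    "print(\"Number of transitions:\", len(episode) - 1)\n",
    "print(\"First states visited:\", episode[:10])"
   ]
  },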
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "LtONXG4JAK8_"
   },
   "source": [
    "### Markov Reward Process - Continuing\n",
    "\n",
    "In a Markov reward process, each transition carries a reward in addition to its transition probability.\n",
    "\n",
    "![Markov Reward Process](https://raw.githubusercontent.com/nsanghi/drl-2ed/main/chapter2/images/mrp_continuing.png \"Markov Reward Process\")"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
